47 research outputs found

Multimodal Emotion Recognition from Voice and Video Signals

Author: Mnasri Zied
Publication date: 01/01/2023

Università degli Studi di Napoli L'Orientale: CINECA IRIS

Duration modeling using DNN for Arabic speech synthesis

Author: Amal Houidhek
Denis Jouvet
Imene Zangar
Mnasri Zied
Vincent Colotte
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2018

Università degli Studi di Napoli L'Orientale: CINECA IRIS

Evaluation of speech unit modelling for HMM-based speech synthesis for Arabic

Author: Colotte Vincent
Houidhek Amal
Jouvet Denis
Mnasri Zied
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

This paper investigates the use of hidden Markov models (HMM) for Modern Standard Arabic speech synthesis. HMM-based speech synthesis systems require a description of each speech unit with a set of contextual features that specifies phonetic, phonological and linguistic aspects. To apply this method to Arabic language, a study of its particularities was conducted to extract suitable contextual features. Two phenomena are highlighted: vowel quantity and gemination. This work focuses on how to model geminated consonants (resp. long vowels), either considering them as fully-fledged phonemes or as the same phonemes as their simple (resp. short) counterparts but with a different duration. Four modelling approaches have been proposed for this purpose. Results of subjective and objective evaluations show that there is no important difference between differentiating modelling units associated to geminated consonants (resp. long vowels) from modelling units associated to simple consonants (resp. short vowels) and merging them as long as gemination and vowel quantity information is included in the set of features.

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"

INRIA a CCSD electronic archive server

Università degli Studi di Napoli L'Orientale: CINECA IRIS

Duration modeling using DNN for Arabic speech synthesis

Author: Colotte Vincent
Houidhek Amal
Jouvet Denis
Mnasri Zied
Zangar Imene
Publication venue: HAL CCSD
Publication date: 01/01/2018

Duration modeling is a key task for every parametric speech synthesis system. Though such parametric systems have been adapted to many languages, no special attention was paid to explicitly handling Arabic speech characteristics. In Arabic, phoneme duration has a distinctive role because of consonant gemination and vowel quantity. Therefore, precise modeling of sound durations is critical. In this paper we compare several approaches to modeling phoneme durations (including duration modeling by the HTS and MERLIN toolkits), and we propose a new approach which relies on a set of models, each one being optimal for a given phoneme class (e.g., simple consonants, geminated consonants, short vowels, and long vowels). An objective evaluation carried out on a set of test sentences shows that the proposed approach leads to more accurate modeling of the phoneme durations.

Crossref

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"

INRIA a CCSD electronic archive server

Statistical modelling of speech units in HMM-based speech synthesis for Arabic

Author: Colotte Vincent
Houidhek Amal
Jouvet Denis
Mnasri Zied
Zangar Imene
Publication venue: HAL CCSD
Publication date: 01/01/2017

This paper investigates statistical parametric speech synthesis of Modern Standard Arabic (MSA). A hidden Markov model (HMM)-based speech synthesis system relies on a description of speech segments corresponding to phonemes, with a large set of features that represent phonetic, phonologic, linguistic and contextual aspects. When applied to MSA, two specific phenomena have to be taken into account: vowel lengthening and consonant gemination. This paper thoroughly studies the modeling of these phenomena through various approaches, for example the use of different units for modeling short vs. long vowels and the use of different units for modeling simple vs. geminated consonants. These approaches are compared to another one which merges short and long variants of a vowel into a single unit, and simple and geminated variants of a consonant into a single unit (these characteristics being handled through the features associated to the sound). Results of subjective evaluation show that there is no significant difference between using the same unit for simple and geminated consonants (as well as for short and long vowels) and using different units for simple vs. geminated consonants (as well as for short vs. long vowels).

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"

INRIA a CCSD electronic archive server

Università degli Studi di Napoli L'Orientale: CINECA IRIS

High quality Arabic text-to-speech synthesis using unit selection

Author: Abdelmalek Raja
Mnasri Zied
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016